Accelerate Training


AutoAssist: A Framework to Accelerate Training of Deep Neural Networks

Neural Information Processing Systems

Deep neural networks have yielded superior performance in many contemporary applications. However, the gradient computation in a deep model with millions of instances leads to a lengthy training process even with modern GPU/TPU hardware acceleration. In this paper, we propose AutoAssist, a simple framework to accelerate training of a deep neural network. Typically, as the training procedure evolves, the amount of improvement by a stochastic gradient update varies dynamically with the choice of instances in the mini-batch. In AutoAssist, we utilize this fact and design an instance shrinking operation that is used to filter out instances with relatively low marginal improvement to the current model; thus the computationally intensive gradient computations are performed on informative instances as much as possible. Specifically, we train a very lightweight Assistant model jointly with the original deep network, which we refer to as Boss.
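The filtering step the abstract describes can be sketched in a few lines. This is an illustrative toy, not the authors' code: the names `assistant_scores` and `shrink_batch` are hypothetical, and the Assistant here is just a logistic model over raw features, with a small floor probability so no instance is discarded forever.

```python
import numpy as np

rng = np.random.default_rng(0)

def assistant_scores(w_a, X):
    """Lightweight Assistant: a logistic model predicting whether an
    instance is still informative (i.e. likely to have high Boss loss)."""
    return 1.0 / (1.0 + np.exp(-X @ w_a))

def shrink_batch(w_a, X, keep_floor=0.1):
    """Instance shrinking: keep instances the Assistant scores as
    informative; keep the rest with a small floor probability so that
    no instance is filtered out permanently."""
    p_keep = np.maximum(assistant_scores(w_a, X), keep_floor)
    mask = rng.random(len(X)) < p_keep
    return X[mask], mask

# Toy mini-batch: 8 instances, 3 features.
X = rng.normal(size=(8, 3))
w_a = np.zeros(3)            # an untrained Assistant scores everything 0.5
Xs, mask = shrink_batch(w_a, X)
# The Boss's expensive gradient step would now run only on Xs.
```

In the full framework, the Assistant would be trained jointly with the Boss on the observed Boss losses, so its scores track which instances currently yield large updates.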


Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Neural Information Processing Systems

We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. By reparameterizing the weights in this way we improve the conditioning of the optimization problem and we speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time.
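The reparameterization itself is one line: each weight vector w is written as w = g · v/‖v‖, so the scalar g carries the norm and v carries only the direction. A minimal sketch of the forward computation:

```python
import numpy as np

def weight_norm(v, g):
    """Weight normalization: reparameterize a weight vector w as
    w = g * v / ||v||, decoupling its norm (g) from its direction (v)."""
    return g * v / np.linalg.norm(v)

v = np.array([3.0, 4.0])      # ||v|| = 5
w = weight_norm(v, g=2.0)
# ||w|| equals g exactly, regardless of the scale of v.
```

During training, gradients are taken with respect to g and v rather than w, which is what changes the conditioning of the problem.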


Reviews: AutoAssist: A Framework to Accelerate Training of Deep Neural Networks

Neural Information Processing Systems

The theoretical study of instance shrinkage in Pegasos is, as far as I know, novel and interesting. Especially interesting is that instance shrinkage does not affect the solution the model converges to, which justifies the later experiments that omit importance-sampling corrections in deep nets. Similarly, the idea of training a small assistant model just to predict the loss of the base model on unseen examples is straightforward and potentially useful. The algorithm is clearly described, including all hyperparameters, and it does look like it should be possible to replicate the experiments. From the experimental section, however, it remains unclear whether this algorithm is actually an improvement over regular training with no curriculum attached.
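The Pegasos connection the review praises can be made concrete: under the hinge loss, an instance with margin y·⟨w, x⟩ ≥ 1 contributes nothing to the subgradient, so skipping its gradient term is exact rather than an approximation. A minimal single-instance Pegasos step illustrating this (standard Pegasos, not the paper's implementation):

```python
import numpy as np

def pegasos_step(w, x, y, lam, t):
    """One Pegasos step for a linear SVM with hinge loss.
    Instances with margin >= 1 have zero hinge subgradient, so a
    shrinking filter can drop their gradient computation exactly."""
    eta = 1.0 / (lam * t)                  # standard Pegasos step size
    if y * (w @ x) < 1.0:                  # instance is informative
        return (1.0 - eta * lam) * w + eta * y * x
    return (1.0 - eta * lam) * w           # only the regularizer acts

w = np.zeros(2)
w = pegasos_step(w, np.array([1.0, 0.0]), y=1.0, lam=0.1, t=1)
```

For deep networks no such exact zero-gradient condition exists, which is why the Assistant has to *predict* which instances are uninformative instead.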


Reviews: AutoAssist: A Framework to Accelerate Training of Deep Neural Networks

Neural Information Processing Systems

This paper addresses an important problem, and the empirical results look promising. The method is simple and clearly presented. To make this work more convincing, as pointed out by the reviewers, it would be nice to add a tuned SGD/momentum baseline and to include a thorough discussion of related work.


Reviews: Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Neural Information Processing Systems

The suggested reparametrisation and its theoretical analysis are very interesting, and I enjoyed reading the paper. However, some points in the theoretical analysis could be improved. The paper argues that the new parametrisation improves the conditioning of the gradient, but neither a strong theoretical argument nor an empirical demonstration for this is given. In line 127 it is said that "Empirically, we find that w is often (close to) a dominant eigenvector of the covariance matrix C", but the corresponding experiments are shown neither in the paper nor in the supplemental material. In lines 122/123 the authors claim: "It has been observed that neural networks with batch normalization also have this property (to be relatively insensitive to different learning rates), which can be explained by this analysis." However, it did not become clear to me how the analysis of the previous sections can be transferred directly to batch normalisation.
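The geometric fact the paper's analysis does establish can be checked numerically: under w = g · v/‖v‖, the gradient with respect to v is the projection of ∇w L onto the complement of w, and is therefore orthogonal to v (scaling v leaves w unchanged). A finite-difference check of this property (an illustrative sketch, not one of the paper's experiments):

```python
import numpy as np

def loss(w):
    return np.sum(w ** 3)      # any smooth test loss works here

def fd_grad_v(v, g, eps=1e-6):
    """Central finite-difference gradient of loss(g * v / ||v||) w.r.t. v."""
    grad = np.zeros_like(v)
    for i in range(len(v)):
        vp, vm = v.copy(), v.copy()
        vp[i] += eps
        vm[i] -= eps
        grad[i] = (loss(g * vp / np.linalg.norm(vp))
                   - loss(g * vm / np.linalg.norm(vm))) / (2 * eps)
    return grad

v = np.array([1.0, 2.0, -0.5])
grad_v = fd_grad_v(v, g=3.0)
# grad_v is (numerically) orthogonal to v.
```

This orthogonality is what underlies the learning-rate robustness argument; the open question raised above is whether the same reasoning carries over to batch normalisation.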


Reduce deep learning training time and cost with MosaicML Composer on AWS

#artificialintelligence

In the past decade, we have seen deep learning (DL) adopted at a tremendous pace by AWS customers. The plentiful, jointly trained parameters of DL models have a large representational capacity that has brought improvements in numerous customer use cases, including image and speech analysis, natural language processing (NLP), time series processing, and more. In this post, we highlight challenges commonly reported specifically in DL training, and how the open-source library MosaicML Composer helps solve them. DL models are trained iteratively, in a nested for-loop: an inner loop iterates through the training dataset chunk by chunk and, if necessary, an outer loop repeats this process several times over the whole dataset.
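The nested loop the post describes is the standard epoch/mini-batch structure. A generic skeleton (not Composer's API; `step_fn` stands in for the forward pass, backward pass, and parameter update):

```python
def train(dataset, num_epochs, batch_size, step_fn):
    """Generic DL training skeleton: an outer loop over epochs repeats
    an inner loop over mini-batches ("chunks") of the dataset."""
    steps = 0
    for epoch in range(num_epochs):                       # outer loop
        for start in range(0, len(dataset), batch_size):  # inner loop
            batch = dataset[start:start + batch_size]
            step_fn(batch)    # forward pass, backward pass, update
            steps += 1
    return steps

# 100 instances in batches of 32 -> 4 batches per epoch, times 2 epochs.
n_steps = train(list(range(100)), num_epochs=2, batch_size=32,
                step_fn=lambda batch: None)
```

Speedup libraries like Composer work by modifying what happens inside this loop (data order, per-step computation) without changing its overall shape.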


59th MDW: Alamo Spark Cell drives innovation throughout the Air Force

#artificialintelligence

Throughout the Air Force, teams referred to as Spark Cells serve as a hub for innovation. The 59th Training Group's Alamo Spark Cell is a collaborative team that focuses on improving training at the Medical Education and Training Campus. "Our Spark Cell team works with the whole campus here and also works with the Air Force Medical Modeling and Simulation Training at Randolph," said Tech. "We have every person we can get involved within the campus, and we brainstorm ideas. We ask ourselves, how can we innovate and accelerate training?" Even during the pandemic, these innovators have implemented new ideas to help improve their students' education.


AutoAssist: A Framework to Accelerate Training of Deep Neural Networks

Zhang, Jiong, Yu, Hsiang-Fu, Dhillon, Inderjit S.

Neural Information Processing Systems
